Social Interaction of Humanoid Robot Based on Audio-Visual Tracking
Authors
Abstract
Social interaction is essential in improving the robot-human interface. Behaviors for social interaction may include paying attention to a new sound source, moving toward it, or keeping face-to-face contact with a moving speaker. Some sound-centered behaviors are difficult to attain, because mixtures of sounds are not handled well or auditory processing is too slow for real-time applications. Recently, Nakadai et al. developed real-time auditory and visual multiple-talker tracking technology that associates auditory and visual streams. The system is implemented on an upper-torso humanoid, and real-time talker tracking with a delay of 200 msec is attained by distributed processing on four PCs connected by Gigabit Ethernet. Focus-of-attention is programmable and allows a variety of behaviors. This paper demonstrates a receptionist robot that focuses on an associated stream, and a companion robot that focuses on an auditory stream.
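The abstract's focus-of-attention mechanism can be pictured as a simple selection policy over perceptual streams: each robot role ranks the stream kinds (associated audio-visual, auditory-only, visual-only) differently. The sketch below is a hypothetical illustration of that idea, not the authors' implementation; the `Stream` fields, the salience measure, and the priority tuples are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class Stream:
    kind: str        # "associated", "auditory", or "visual" (assumed taxonomy)
    azimuth: float   # direction of the talker in degrees (hypothetical field)
    salience: float  # activity measure used to break ties (hypothetical field)

def focus_of_attention(streams: Sequence[Stream],
                       priority: Sequence[str]) -> Optional[Stream]:
    """Scan stream kinds in priority order; within the first kind that has
    any candidates, attend to the most salient one. Returns None if no
    streams are present."""
    for kind in priority:
        candidates = [s for s in streams if s.kind == kind]
        if candidates:
            return max(candidates, key=lambda s: s.salience)
    return None

# Assumed role policies, mirroring the abstract's description:
# the receptionist prefers an associated (audio+visual) stream,
# the companion prefers an auditory stream.
RECEPTIONIST_PRIORITY = ("associated", "auditory", "visual")
COMPANION_PRIORITY = ("auditory", "associated", "visual")
```

For example, given one auditory-only stream and one associated stream, the receptionist policy selects the associated stream while the companion policy selects the auditory one; swapping the priority tuple is what makes the attention programmable.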
Similar Resources
Realizing Audio-Visually Triggered ELIZA-Like Non-verbal Behaviors
We are studying how to create social physical agents, i.e., humanoids, that perform actions empowered by real-time audio-visual tracking of multiple talkers. Social skills require complex perceptual and motor capabilities as well as communicating ones. It is critical to identify primary features in designing building blocks for social skills, because performance of social interaction is usually...
Realizing personality in audio-visually triggered non-verbal behaviors
Controlling robot behaviors has become more important recently, as active perception for robots, in particular active audition in addition to active vision, has made remarkable progress. We are studying how to create social humanoids that perform actions empowered by real-time audio-visual tracking of multiple talkers. In this paper, we present personality as a means of controlling non-verbal behaviors. ...
The Vernissage Corpus: A Multimodal Human-Robot Interaction Dataset
We introduce a new multimodal interaction dataset with extensive annotations in a conversational Human-Robot Interaction (HRI) scenario. It has been recorded and annotated to benchmark many relevant perceptual tasks, towards enabling a robot to converse with multiple humans, such as speaker localization, keyword spotting, and speech recognition in the audio domain; tracking, pose estimation, nodding, v...
Audio-Visual Perception System for a Humanoid Robotic Head
One of the main issues within the field of social robotics is to endow robots with the ability to direct attention to people with whom they are interacting. Different approaches follow bio-inspired mechanisms, merging audio and visual cues to localize a person using multiple sensors. However, most of these fusion mechanisms have been used in fixed systems, such as those used in video-conference...
Audiovisual analysis of relations between laughter types and laughter motions
Laughter commonly occurs in daily interactions and is not simply related to funny situations; it also expresses attitudes and serves important social functions in communication. The background of the present work is the generation of natural laughter motions in a humanoid robot, where miscommunication may occur if there is a mismatch between the audio and visual modalities, especially...